Abstract



In recent years, many improvements to backtracking algorithms for solving constraint 
satisfaction problems have been proposed. The techniques for improving backtracking al- 
gorithms can be conveniently classified as look-ahead schemes and look-back schemes. Un- 
fortunately, look-ahead and look-back schemes are not entirely orthogonal as it has been 
observed empirically that the enhancement of look-ahead techniques is sometimes counter- 
productive to the effects of look-back techniques. In this paper, we focus on the relationship 
between the two most important look-ahead techniques — using a variable ordering heuris- 
tic and maintaining a level of local consistency during the backtracking search — and the 
look-back technique of conflict-directed backjumping (CBJ). We show that there exists a 
"perfect" dynamic variable ordering such that CBJ becomes redundant. We also show 
theoretically that the higher the level of local consistency maintained in the backtracking
search, the smaller the improvement that backjumping can provide. Our theoretical
results partially explain why a backtracking algorithm doing more in the look-ahead phase 
cannot benefit more from the backjumping look-back scheme. Finally, we show empirically 
that adding CBJ to a backtracking algorithm that maintains generalized arc consistency 
(GAC), an algorithm that we refer to as GAC-CBJ, can still provide orders of magnitude 
speedups. Our empirical results contrast with Bessiere and Regin's conclusion (1996) that 
CBJ is useless to an algorithm that maintains arc consistency. 

1. Introduction 

Constraint satisfaction problems (CSPs) are a generic problem solving framework. A con- 
straint satisfaction problem consists of a set of variables, each associated with a domain of 
values, and a set of constraints. Each of the constraints is expressed as a relation, defined 
on some subset of the variables, denoting the consistent value assignments that satisfy the 
constraint. A solution to a CSP is an assignment of a value to every variable, in such a way 
that every constraint is satisfied. 

Constraint satisfaction problems are usually solved by search methods, among which 
the backtracking algorithm and its improvements are widely used. The techniques for 
improving backtracking algorithms can be conveniently classified as look-ahead schemes 
and look-back schemes (Dechter, 1992). Look-ahead schemes are invoked whenever the 
algorithm is preparing to extend the current partial solution. Look-ahead schemes include 
the functions that choose the next variable to be instantiated, choose the next value to 
give to the current variable, and reduce the search space by maintaining a certain level of 
local consistency during the search (e.g., Bacchus & van Run, 1995; Bessiere & Regin, 1996; 



©2001 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved. 



Chen & van Beek 



Haralick & Elliott, 1980; McGregor, 1979; Nadel, 1989; Sabin & Freuder, 1994). Look- 
back schemes are invoked whenever the algorithm encounters a dead-end and prepares for 
the backtracking step. Look-back schemes include the functions that decide how far to 
backtrack by analyzing the reasons for the dead-end (backjumping) and decide what new 
constraints to record so that the same conflicts do not arise again later in the search (e.g., 
Bruynooghe, 1981; Dechter, 1990; Frost & Dechter, 1994; Gaschnig, 1978; Prosser, 1993b; 
Schiex & Verfaillie, 1994). 

A backtracking algorithm can be a hybrid of both look-ahead and look-back schemes 
(Prosser, 1993b). In this paper, we focus on the relationship between the two most impor- 
tant look-ahead techniques — using a variable ordering heuristic and maintaining a level of 
local consistency during the backtracking search — and the look-back technique of conflict- 
directed backjumping (CBJ) (Prosser, 1993b). Unfortunately, these look-ahead and look- 
back schemes are not entirely orthogonal as it can be observed in previous experimental 
work that as the level of consistency that is maintained in the backtracking search is in- 
creased and as the variable ordering heuristic is improved, the effects of CBJ are diminished 
(Bacchus & van Run, 1995; Bessiere & Regin, 1996; Prosser, 1993a, 1993b). For example, it 
can be observed in Prosser's (1993b) experiments that, given a static variable ordering, in- 
creasing the level of local consistency maintained from none to the level of forward checking, 
diminishes the effects of CBJ. Bacchus and van Run (1995) observe from their experiments 
that adding a dynamic variable ordering (an improvement over a static variable ordering) 
to a forward checking algorithm diminishes the effects of CBJ. In their experiments the 
effects are so diminished as to be almost negligible and they present an argument for why 
this might hold in general. Bessiere and Regin (1996) observe from their experiments that 
simultaneously increasing the level of local consistency even further to arc consistency and 
further improving the dynamic variable ordering heuristic diminishes the effects of CBJ 
so much that, in their implementation, the overhead of maintaining the data structures for 
backjumping actually slows down the algorithm. They conjecture that when arc consistency 
is maintained and a good variable ordering heuristic is used, "CBJ becomes useless". 

In this paper, we present theoretical results that deepen our understanding of the rela- 
tionship between look-ahead techniques and the CBJ look-back technique. We show that 
there exists a "perfect" dynamic variable ordering for the chronological backtracking algo- 
rithm such that CBJ becomes redundant. The more that a variable ordering heuristic is 
consistent with the "perfect" heuristic, the less chance CBJ has to reduce the search effort. 
We also show that CBJ and an algorithm that maintains strong $k$-consistency in the
backtracking search are incomparable in that each can be exponentially better than the other.
This result is refined by introducing the concept of backjump level in the execution of a
backjumping algorithm and showing that an algorithm that maintains strong $k$-consistency
never visits more nodes than a backjumping algorithm that is allowed to backjump at most
$k$ levels. Thus, the higher the level of local consistency maintained in the backtracking
search, the smaller the improvement that backjumping can provide. Together, our theoretical
results partially explain why a backtracking algorithm doing more in the look-ahead phase 
cannot benefit more from the backjumping look-back scheme. Our results also extend the 
partial ordering of backtracking algorithms presented by Kondrak and van Beek (1997) to 
include backtracking algorithms and their CBJ hybrids that maintain levels of local
consistency beyond forward checking, including the important algorithms that maintain arc
consistency. 

We also present empirical results that show that, although the effects of CBJ may 
be diminished, adding CBJ to a backtracking algorithm that maintains generalized arc 
consistency (GAC), an algorithm that we refer to as GAC-CBJ, can still provide orders 
of magnitude speedups. Our empirical results contrast with Bessiere and Regin's (1996) 
conclusion that CBJ is useless to an algorithm that maintains arc consistency. 

2. Background 

In this section, we formally define constraint satisfaction problems, and briefly review local 
consistency and the search tree explored by a backtracking algorithm. 

2.1 Constraint Satisfaction Problems 

Definition 1 (CSP) An instance of a constraint satisfaction problem is a tuple
$P = (\mathcal{V}, \mathcal{D}, \mathcal{C})$, where (see footnote 1)

• $\mathcal{V} = \{x_1, \ldots, x_n\}$ is a finite set of $n$ variables,

• $\mathcal{D} = \{\mathit{dom}(x_1), \ldots, \mathit{dom}(x_n)\}$ is a set of domains. Each variable
$x \in \mathcal{V}$ is associated with a finite domain of possible values, $\mathit{dom}(x)$. The
maximum domain size $\max_{x \in \mathcal{V}} |\mathit{dom}(x)|$ is denoted by $d$,

• $\mathcal{C} = \{C_1, \ldots, C_m\}$ is a finite set of $m$ constraints or relations. Each
constraint $C \in \mathcal{C}$ is a pair $(\mathit{vars}(C), \mathit{rel}(C))$, where

  - $\mathit{vars}(C) = \{x_{i_1}, \ldots, x_{i_{r'}}\}$ is an ordered subset of the variables, called
    the constraint scope or scheme. The size of $\mathit{vars}(C)$ is known as the arity of the
    constraint. If the arity of the constraint is equal to 2, it is called a binary constraint.
    A non-binary constraint is a constraint with arity greater than 2. The maximum
    arity of the constraints in $\mathcal{C}$, $\max_{C \in \mathcal{C}} |\mathit{vars}(C)|$, is denoted by $r$,

  - $\mathit{rel}(C)$ is a subset of the Cartesian product
    $\mathit{dom}(x_{i_1}) \times \cdots \times \mathit{dom}(x_{i_{r'}})$ that specifies
    the allowed combinations of values for the variables in $\mathit{vars}(C)$. An element of
    the Cartesian product $\mathit{dom}(x_{i_1}) \times \cdots \times \mathit{dom}(x_{i_{r'}})$ is called
    a tuple on $\mathit{vars}(C)$. Thus, $\mathit{rel}(C)$ is often regarded as a set of tuples over
    $\mathit{vars}(C)$.

In the following, we assume that for any variable $x \in \mathcal{V}$, there is at least one constraint
$C \in \mathcal{C}$ such that $x \in \mathit{vars}(C)$. By definition, a tuple over a set of variables
$X = \{x_1, \ldots, x_k\}$ is an ordered list of values $(a_1, \ldots, a_k)$ such that
$a_i \in \mathit{dom}(x_i)$, $i = 1, \ldots, k$. A tuple over $X$ can also be regarded as a set of
variable-value pairs $\{x_1 \leftarrow a_1, \ldots, x_k \leftarrow a_k\}$. Furthermore,
a tuple over $X$ can be viewed as a function $t : X \rightarrow \bigcup_{x \in X} \mathit{dom}(x)$ such that
for each variable $x \in X$, $t[x] \in \mathit{dom}(x)$. For a subset of variables $X' \subseteq X$, we use
$t[X']$ to denote a tuple over $X'$ by restricting $t$ over $X'$. We also use $\mathit{vars}(t)$ to
denote the set of variables for tuple $t$.

1. Throughout the paper, we use n, d, m, and r to denote the number of variables, the maximum domain 
size, the number of constraints, and the maximum arity of the constraints in the CSP, respectively. 
An assignment to a set of variables $X$ is a tuple over $X$. We say an assignment $t$ to $X$
is consistent with a constraint $C$ if either $\mathit{vars}(C) \not\subseteq X$ or
$t[\mathit{vars}(C)] \in \mathit{rel}(C)$. A partial
solution to a CSP is an assignment to a subset of variables. We say a partial solution is
consistent if it is consistent with each of the constraints. A solution to a CSP is a consistent
partial solution over all the variables. If no solution exists, the CSP is said to be insoluble.
A CSP is empty if either one of its variables has an empty domain or one of its constraints
has an empty set of tuples. Obviously, an empty CSP is insoluble. Given two CSP instances
$P_1$ and $P_2$, we say $P_1 = P_2$ if they have exactly the same set of variables, the same set of
domains and the same set of constraints; i.e., they are syntactically the same.
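The definitions above can be made concrete with a small sketch. The encoding below (domains as a dictionary, each constraint as a pair of a scope tuple and an explicit set of allowed tuples) and the toy CSP are assumptions chosen for illustration; they are not taken from the paper.

```python
from itertools import product

# Hypothetical toy CSP (not from the paper): x1 < x2 and x2 != x3 over {0, 1, 2}.
domains = {"x1": [0, 1, 2], "x2": [0, 1, 2], "x3": [0, 1, 2]}

# Each constraint C is a pair (vars(C), rel(C)); rel(C) is an explicit set of tuples.
constraints = [
    (("x1", "x2"), {(a, b) for a, b in product(range(3), repeat=2) if a < b}),
    (("x2", "x3"), {(a, b) for a, b in product(range(3), repeat=2) if a != b}),
]


def consistent_with(t, constraint):
    """t is consistent with C if vars(C) is not covered by t, or t[vars(C)] is in rel(C)."""
    scope, rel = constraint
    if not all(x in t for x in scope):
        return True
    return tuple(t[x] for x in scope) in rel


def is_consistent(t, constraints):
    """A partial solution is consistent if it is consistent with every constraint."""
    return all(consistent_with(t, c) for c in constraints)


def is_solution(t, domains, constraints):
    """A solution is a consistent partial solution over all the variables."""
    return set(t) == set(domains) and is_consistent(t, constraints)
```

Under this encoding, the partial solution `{"x1": 0}` is consistent because no constraint scope is fully covered, while `{"x1": 2, "x2": 1}` is not, because the tuple (2, 1) is absent from the first relation.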

Definition 2 (projection) Given a constraint $C$ and a subset of variables $S \subseteq \mathit{vars}(C)$,
the projection $\pi_S C$ is a constraint, where $\mathit{vars}(\pi_S C) = S$ and
$\mathit{rel}(\pi_S C) = \{t[S] \mid t \in \mathit{rel}(C)\}$.

Definition 3 (selection) Given a constraint $C$ and an assignment $t$ to a subset of
variables $X \subseteq \mathit{vars}(C)$, the selection $\sigma_t C$ is a constraint, where
$\mathit{vars}(\sigma_t C) = \mathit{vars}(C)$ and
$\mathit{rel}(\sigma_t C) = \{s \mid s[X] = t \text{ and } s \in \mathit{rel}(C)\}$.
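Projection and selection can be sketched as two small functions on explicit relations. The function names `project` and `select`, the representation of a constraint as a (scope, set-of-tuples) pair, and the ternary example constraint are assumptions made for this illustration.

```python
def project(constraint, S):
    """pi_S C: keep only the columns of rel(C) that belong to variables in S."""
    scope, rel = constraint
    keep = [i for i, x in enumerate(scope) if x in S]
    new_scope = tuple(scope[i] for i in keep)
    return new_scope, {tuple(t[i] for i in keep) for t in rel}


def select(constraint, t):
    """sigma_t C: keep the tuples of rel(C) that agree with the assignment t."""
    scope, rel = constraint
    index = {x: i for i, x in enumerate(scope)}
    return scope, {s for s in rel if all(s[index[x]] == v for x, v in t.items())}


# Hypothetical ternary constraint x + y = z over {0, 1, 2}.
C = (
    ("x", "y", "z"),
    {(x, y, z) for x in range(3) for y in range(3) for z in range(3) if x + y == z},
)

projected = project(C, {"x", "z"})   # pairs (x, z) for which some y gives x + y = z
selected = select(C, {"x": 1})       # tuples of C with x fixed to 1
```

Note that projection shrinks the scope while selection keeps the scope and only filters the tuples, mirroring the two definitions.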

2.2 Local Consistency 

An inconsistency is a consistent partial solution over some of the variables that cannot be 
extended to additional variables and so cannot be part of any global solution. If we are 
using a backtracking search to find a solution, such an inconsistency can lead to a dead end 
in the search. This insight has led to the definition of properties that characterize the level 
of consistency of a CSP and to the development of algorithms for achieving these levels 
of consistency by removing inconsistencies (e.g., Mackworth, 1977a; Montanari, 1974), and 
to effective backtracking algorithms for finding solutions to CSPs that maintain a level of 
consistency during the search (e.g., Gaschnig, 1978; Haralick & Elliott, 1980; McGregor, 
1979; Sabin & Freuder, 1994). 

Mackworth (1977a) defines three properties of binary CSPs that characterize local con- 
sistencies: node, arc, and path consistency. Mackworth (1977b) generalizes arc consistency 
to non-binary CSPs. 

Definition 4 (arc consistency) Given a constraint $C$ and a variable $x \in \mathit{vars}(C)$, a
value $a \in \mathit{dom}(x)$ is supported in $C$ if there is a tuple $t \in \mathit{rel}(C)$ such that
$t[x] = a$. The tuple $t$ is then called a support for $\{x \leftarrow a\}$ in $C$. $C$ is arc
consistent if for each of the variables $x \in \mathit{vars}(C)$, and each of the values
$a \in \mathit{dom}(x)$, $\{x \leftarrow a\}$ is supported in $C$. A CSP is arc
consistent if each of its constraints is arc consistent.
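As an illustration of Definition 4, the following sketch tests support and removes unsupported values until a fixed point is reached. It is a naive propagation loop written for clarity, not one of the published arc consistency algorithms; the function names and the example CSP are assumptions. Because the relations are stored once and the domains shrink, a tuple only counts as a support if all of its values are still in the current domains.

```python
def supported(constraint, domains, x, a):
    """True if some tuple in rel(C) assigns a to x and uses only values still in the domains."""
    scope, rel = constraint
    i = scope.index(x)
    return any(
        t[i] == a and all(t[j] in domains[y] for j, y in enumerate(scope))
        for t in rel
    )


def enforce_arc_consistency(domains, constraints):
    """Delete unsupported values repeatedly until every constraint is arc consistent."""
    domains = {x: set(d) for x, d in domains.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for c in constraints:
            for x in c[0]:
                dead = {a for a in domains[x] if not supported(c, domains, x, a)}
                if dead:
                    domains[x] -= dead
                    changed = True
    return domains


# Hypothetical example: x1 < x2 and x2 < x3 over {0, 1, 2}.
less_than = {(a, b) for a in range(3) for b in range(3) if a < b}
example_domains = {"x1": {0, 1, 2}, "x2": {0, 1, 2}, "x3": {0, 1, 2}}
example_constraints = [(("x1", "x2"), less_than), (("x2", "x3"), less_than)]
reduced = enforce_arc_consistency(example_domains, example_constraints)
```

In this example the propagation leaves a single value in each domain, so the only solution can be read off without any search.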

Freuder (1978) generalizes node, arc, and path consistency to $k$-consistency.

Definition 5 ($k$-consistency) A CSP is $k$-consistent if and only if given any consistent
partial solution over $k - 1$ distinct variables, there exists an instantiation of any $k$th variable
such that the partial solution plus that instantiation is consistent. A CSP is strongly
$k$-consistent if it is $j$-consistent for all $1 \le j \le k$.
For binary CSPs, node, arc and path consistency correspond to one-, two- and
three-consistency, respectively. However, the definition of $k$-consistency does not require the CSP
to be binary and arc consistency is not the same as two-consistency for non-binary CSPs.
A strongly $n$-consistent CSP has the property that any consistent partial solution can be
successively extended to a full solution of the CSP without backtracking.
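A brute-force check of Definition 5 may help to see what $k$-consistency demands. The sketch below enumerates every consistent assignment to $k - 1$ variables and tries to extend it to each remaining variable; it is exponential and only meant for tiny instances. The function names and the example (three pairwise different variables over two values) are assumptions for illustration.

```python
from itertools import combinations, product


def consistent(t, constraints):
    """Check every constraint whose scope is fully covered by the assignment t."""
    return all(
        tuple(t[x] for x in scope) in rel
        for scope, rel in constraints
        if all(x in t for x in scope)
    )


def is_k_consistent(domains, constraints, k):
    """Every consistent assignment to k - 1 variables extends to any k-th variable."""
    variables = list(domains)
    for past in combinations(variables, k - 1):
        for values in product(*(domains[x] for x in past)):
            t = dict(zip(past, values))
            if not consistent(t, constraints):
                continue
            for y in variables:
                if y in t:
                    continue
                if not any(consistent({**t, y: b}, constraints) for b in domains[y]):
                    return False
    return True


def is_strongly_k_consistent(domains, constraints, k):
    return all(is_k_consistent(domains, constraints, j) for j in range(1, k + 1))


# Hypothetical example: three pairwise different variables, but only two values.
not_equal = {(0, 1), (1, 0)}
doms = {"x1": [0, 1], "x2": [0, 1], "x3": [0, 1]}
cons = [
    (("x1", "x2"), not_equal),
    (("x2", "x3"), not_equal),
    (("x1", "x3"), not_equal),
]
```

This instance is strongly 2-consistent yet not 3-consistent, and it has no solution, which shows that a low level of local consistency does not rule out dead-ends during search.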

2.3 Search Tree and Backtracking Algorithms 

The idea of a backtracking algorithm is to extend partial solutions. At each stage, an unin- 
stantiated variable is selected and assigned a value from its domain to extend the current 
partial solution (see footnote 2). Constraints are used to check whether such an extension may lead to a
possible solution of the CSP and to prune subtrees containing no solutions based on the 
current partial solution. During a backtracking search, the variables can be divided into 
three sets: past variables (already instantiated), current variable (now being instantiated), 
and future variables (not yet instantiated). A dead-end occurs when all values of the cur- 
rent variable are rejected as not leading to a full solution. In such a case, some instantiated 
variables become uninstantiated; i.e., they are removed from the current partial solution. 
This process is called backtracking. If only the most recently instantiated variable becomes 
uninstantiated then it is called chronological backtracking; otherwise, it is called backjump- 
ing. A backtracking algorithm terminates when all possible assignments have been tested 
or a certain number of solutions have been found. 

A backtracking search may be seen as a search tree traversal. In this approach we 
identify tuples (assignments of values to variables) with nodes: the empty tuple is the root 
of the tree, the first level nodes are 1-tuples (representing an assignment of a value to a 
single variable), the second level nodes are 2-tuples, and so on. The levels closer to the 
root are called shallower levels and the levels farther from the root are called deeper levels. 
Similarly, the variables corresponding to these levels are called shallower and deeper. We 
say that a backtracking algorithm visits a node in the search tree if at some stage of the 
algorithm's execution the current partial solution identifies the node. The nodes visited 
by a backtracking algorithm form a subset of all the nodes belonging to the search tree. 
We call this subset, together with the connecting edges, the backtrack tree generated by a 
backtracking algorithm. 
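The notion of a visited node can be illustrated with a minimal chronological backtracking sketch that records every node it visits. It assumes a static variable ordering and a static value ordering, searches for all solutions, and counts a node each time a value is assigned, whether or not the resulting partial solution is consistent. The function name, the data encoding and the example CSP are assumptions.

```python
def backtrack(domains, constraints, order):
    """Chronological backtracking (BT); returns all solutions and the visited nodes."""
    solutions, visited = [], []
    t = {}  # current partial solution

    def consistent():
        # Check every constraint whose scope is fully instantiated.
        return all(
            tuple(t[x] for x in scope) in rel
            for scope, rel in constraints
            if all(x in t for x in scope)
        )

    def extend(depth):
        if depth == len(order):
            solutions.append(dict(t))
            return
        x = order[depth]
        for a in domains[x]:
            t[x] = a
            visited.append(tuple(t.items()))  # the node identified by the current tuple
            if consistent():
                extend(depth + 1)
            del t[x]  # uninstantiate x before trying the next value

    extend(0)
    return solutions, visited


# Hypothetical example: x1 < x2 and x2 < x3 over {0, 1, 2}.
less_than = {(a, b) for a in range(3) for b in range(3) if a < b}
doms = {"x1": [0, 1, 2], "x2": [0, 1, 2], "x3": [0, 1, 2]}
cons = [(("x1", "x2"), less_than), (("x2", "x3"), less_than)]
solutions, visited = backtrack(doms, cons, ["x1", "x2", "x3"])
```

The list `visited` is the backtrack tree in the order it is generated; its length is the node count that the comparisons between algorithms in this paper refer to.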

The backtracking algorithm conflict-directed backjumping (CBJ) (Prosser, 1993b)
maintains a conflict set for every variable. Every time an instantiation of the current variable
$x_i$ is in conflict with an instantiation of some past variable $x_h$, the variable $x_h$ is added to
the conflict set of $x_i$. When there are no more values to be tried for the current variable $x_i$,
CBJ backtracks to the deepest variable $x_h$ in the conflict set of $x_i$. At the same time, the
variables in the conflict set of $x_i$, with the exception of $x_h$, are added to the conflict set of
$x_h$, so that no information about conflicts is lost.
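The conflict-set bookkeeping described above can be sketched recursively. This is a simplified illustration under stated assumptions, not Prosser's original formulation: it uses a static variable ordering, stops at the first solution, and on a failed check adds all other variables in the scope of the first violated constraint to the conflict set. A dead-end returns its conflict set to the caller; a caller whose variable is not in that set is skipped, which is the backjump.

```python
def cbj(domains, constraints, order):
    """Conflict-directed backjumping; returns (first solution or None, nodes visited)."""
    t = {}
    nodes = 0

    def conflict(x):
        """Past variables of the first violated, fully instantiated constraint on x."""
        for scope, rel in constraints:
            if x in scope and all(y in t for y in scope):
                if tuple(t[y] for y in scope) not in rel:
                    return {y for y in scope if y != x}
        return None

    def search(depth):
        nonlocal nodes
        if depth == len(order):
            return dict(t), None
        x = order[depth]
        conf = set()  # conflict set of x
        for a in domains[x]:
            t[x] = a
            nodes += 1
            culprits = conflict(x)
            if culprits is None:
                solution, below = search(depth + 1)
                if solution is not None:
                    return solution, None
                if x not in below:
                    del t[x]
                    return None, below  # the backjump passes over x
                conf |= below - {x}     # x is the jump target: merge the rest
            else:
                conf |= culprits
            del t[x]
        return None, conf               # dead-end at x: report its conflict set

    solution, _ = search(0)
    return solution, nodes


# Hypothetical example: x3 has no consistent value unless x1 = 1; x2 is irrelevant.
doms = {"x1": [0, 1], "x2": [0, 1], "x3": [0, 1]}
cons = [(("x1", "x3"), {(1, 0), (1, 1)})]
solution, nodes = cbj(doms, cons, ["x1", "x2", "x3"])
```

In this example the dead-end at `x3` has conflict set `{x1}`, so the search jumps over `x2` instead of retrying `x3` under the other value of `x2`, as chronological backtracking would.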

Throughout the paper we refer to the following backtracking algorithms (see Kondrak 
& van Beek, 1997; Prosser, 1993b for detailed explanations and examples of most of these 
algorithms): chronological backtracking (BT), backjumping (BJ) (Gaschnig, 1978), conflict- 
directed backjumping (CBJ) (Prosser, 1993b), forward checking (FC) (Haralick & Elliott, 
1980; McGregor, 1979), forward checking and conflict-directed backjumping (FC-CBJ) 

2. Throughout this paper, we assume that a static value ordering is used in the backtracking search. 
(Prosser, 1993b), maintaining arc consistency (MAC) (Gaschnig, 1978; Sabin & Freuder,
1994), and maintaining arc consistency and conflict-directed backjumping (MAC-CBJ)
(Prosser, 1995). 

3. Variable Ordering Heuristics and Backjumping 

In this section, we present theoretical results that deepen our understanding of the rela- 
tionship between the look-ahead technique of using a variable ordering heuristic and the 
look-back technique of CBJ. 

In previous work, Kondrak and van Beek (1997) show that, given the same deterministic 
static or dynamic variable ordering heuristic, CBJ never visits more nodes than BT. Bacchus 
and van Run (1995) show that BJ, a restricted version of CBJ, visits exactly the same nodes 
as BT if the fail-first dynamic variable ordering heuristic is used. Previous empirical work 
shows that the number of nodes that CBJ saves depends on the variable ordering heuristic 
used (Bacchus & van Run, 1995; Bessiere & Regin, 1996; Prosser, 1993b). 

We show that, given a CSP and a variable ordering for CBJ, there exists a "perfect" 
variable ordering for the chronological backtracking algorithm (BT) such that BT never 
visits more nodes than CBJ. The more that a variable ordering heuristic is consistent with 
the "perfect" heuristic, the less chance CBJ has to reduce the search effort. 

We first consider the case of insoluble CSPs. When CBJ is applied to an insoluble CSP, 
it always backjumps from a dead-end state; i.e., it does not terminate or backjump from a 
situation in which a solution of the CSP was found. 

Lemma 1 Given an insoluble CSP and a variable ordering for CBJ, there exists a variable 
ordering for BT such that BT never visits more nodes than CBJ to show that no solution 
exists. 

Proof In the backtrack tree generated by CBJ under the variable ordering, let the last
backjump that terminates the execution of CBJ be from variable $x_j$ to the root of the
backtrack tree. We choose $x_j$ to be the first variable for BT. For each value $a$ in the domain
of $x_j$, if the current node in the backtrack tree for CBJ is consistent (not a leaf node), the
next variable chosen to be instantiated after assigning $a$ to $x_j$ is the variable that backjumps
to $x_j$ and causes the assignment $x_j \leftarrow a$ to be revoked. The entire variable ordering for
BT can be worked out in a similar, recursive manner. For this variable ordering for BT to
be well-defined, it remains to show that if the current node in the backtrack tree for CBJ
is inconsistent (a leaf node), the corresponding node in the backtrack tree for BT is also
inconsistent (and therefore no next variable needs to be chosen). We show that the variables
skipped in the variable ordering constructed for BT are irrelevant to the dead-end states
encountered by CBJ. Suppose at a stage we have ordered the variables to be instantiated
for BT as $x_{j_1}, \ldots, x_{j_k}$, and for value $a \in \mathit{dom}(x_{j_k})$ we choose the next variable
$x_{j_{k+1}}$ as the variable which backjumps to the current variable $x_{j_k}$ in the CBJ backtrack
tree. We prove by induction that the conflict set of $x_{j_k}$ used in the backjumping is subsumed by
$\{x_{j_1}, \ldots, x_{j_{k-1}}\}$. The case $k = 1$ is the case of the last backjump that terminates the
execution of CBJ. The hypothesis is true because the conflict set of $x_{j_1}$ is an empty set.
Suppose it is true for the case of $k \ge 1$. Because $x_{j_{k+1}}$ backjumps to $x_{j_k}$, the conflict set
of $x_{j_{k+1}}$ is merged in the conflict set of $x_{j_k}$. From the inductive assumption, the conflict
set of $x_{j_k}$ is subsumed by $\{x_{j_1}, \ldots, x_{j_{k-1}}\}$, and thus the conflict set of $x_{j_{k+1}}$ is
subsumed by $\{x_{j_1}, \ldots, x_{j_k}\}$. Therefore,
the hypothesis holds for the case of $k + 1$. If CBJ finds out that instantiation $x_{j_k} \leftarrow a$ is
inconsistent with the assignments of some past variables which are added to the conflict
set of $x_{j_k}$, BT is also able to find out the inconsistency because the conflict set of $x_{j_k}$ is
subsumed by $\{x_{j_1}, \ldots, x_{j_{k-1}}\}$. Thus, the variable ordering for BT is well-defined. □

For soluble CSPs, we further distinguish the problem between finding one solution and 
finding all solutions. 

Lemma 2 Given a CSP and a variable ordering for CBJ to find the first solution, there 
exists a variable ordering for BT such that BT never visits more nodes than CBJ to find 
the first solution. 

Proof Without loss of generality, let $\{x_1 \leftarrow a_1, \ldots, x_n \leftarrow a_n\}$ be the first solution
found. A variable ordering for BT can be constructed in the following way. The first variable chosen
for BT is $x_1$ as it is the first variable in the path from the root to the solution in the CBJ
backtrack tree. Because we assume a static value ordering in the backtracking search, all
values in the domain of $x_1$ that precede value $a_1$ must be rejected by CBJ and BT before
value $a_1$ is used to instantiate $x_1$. Furthermore, because
$\{x_1 \leftarrow a_1, \ldots, x_n \leftarrow a_n\}$ is the
first solution encountered by CBJ under the above variable ordering and value ordering,
the instantiation of $x_1$ with a value preceding $a_1$ leads to an insoluble subproblem and
eventually CBJ backjumps from a deeper variable to $x_1$ to revoke that assignment. Note
that $x_1$ cannot be skipped by a backjump from a deeper variable because $x_1$ is on the first
level of the search tree and there is a solution for the CSP. Assigning $x_1$ with each of the
values that precede $a_1$ in its domain leads to insoluble subproblems and the instantiation
order for BT can be arranged as in Lemma 1. Whenever $x_k$ is instantiated with value $a_k$,
$x_{k+1}$ is chosen to be the next variable, as it follows $x_k$ in the path from the root to the
solution in the CBJ backtrack tree. Again, all values in the domain of $x_{k+1}$ that precede
$a_{k+1}$ in the value ordering must be rejected by CBJ and BT before $a_{k+1}$ is assigned to
$x_{k+1}$. The instantiation of $x_{k+1}$ with each of these values leads to an insoluble subproblem
and eventually CBJ backjumps from a deeper variable to $x_{k+1}$. Similarly, $x_{k+1}$ cannot
be skipped by a backjump from a deeper variable because otherwise at least one of the
assignments to $x_1, \ldots, x_k$ must be changed so that
$\{x_1 \leftarrow a_1, \ldots, x_n \leftarrow a_n\}$ is not the
first solution encountered by CBJ. In each of these insoluble subproblems, the instantiation
order for BT can be arranged as in Lemma 1. Finally, $x_n$ is instantiated with $a_n$ and BT
finds the solution. □

When CBJ is used to find all solutions, special steps must be taken to handle the con- 
flict sets. The problem here is that the conflict sets of CBJ are meant to indicate which 
instantiations are responsible for some previously discovered inconsistency. However, after 
a solution is found, conflict sets cannot always be interpreted in this way. It is the search 
for other solutions, rather than an inconsistency, that causes the algorithm to backtrack. 
We need to differentiate between two causes of CBJ backtracks: (1) detecting an incon- 
sistency, and (2) searching for other solutions. In the latter case, the backtrack must be 
always chronological; that is, to the immediately preceding variable. A simple solution is to 
remember the number of solutions found so far when a variable is chosen to be instantiated, 
and later when a dead-end state is encountered at this level, we compare the recorded
ber with the current number of solutions. A difference indicates that some solutions have 
been found in this interval of search, and forces the algorithm to backtrack chronologically. 
Otherwise the algorithm performs a normal backjumping by analyzing the conflict set of 
the current variable. 
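The solution-counting device described above can be sketched as follows. This is an illustrative variant under assumptions (static variable ordering, recursive formulation, hypothetical names), not the authors' implementation. Each level records the number of solutions when its variable is chosen; at a dead-end, a changed count forces a chronological step, which the sketch expresses by returning a conflict set containing every past variable.

```python
def cbj_all(domains, constraints, order):
    """CBJ adapted to find all solutions; returns (solutions, nodes visited)."""
    solutions, t = [], {}
    nodes = 0

    def conflict(x):
        """Past variables of the first violated, fully instantiated constraint on x."""
        for scope, rel in constraints:
            if x in scope and all(y in t for y in scope):
                if tuple(t[y] for y in scope) not in rel:
                    return {y for y in scope if y != x}
        return None

    def search(depth):
        nonlocal nodes
        if depth == len(order):
            solutions.append(dict(t))
            return set(order)            # after a solution, step back chronologically
        x = order[depth]
        found_before = len(solutions)    # remembered when x is chosen
        conf = set()
        for a in domains[x]:
            t[x] = a
            nodes += 1
            culprits = conflict(x)
            if culprits is None:
                below = search(depth + 1)
                if x not in below:
                    del t[x]
                    return below         # genuine backjump over x
                conf |= below - {x}
            else:
                conf |= culprits
            del t[x]
        if len(solutions) != found_before:
            return set(order[:depth])    # solutions were found: backtrack chronologically
        return conf                      # no solutions below: normal backjumping

    search(0)
    return solutions, nodes


# Hypothetical example: every solution needs x1 = 1; x2 and x3 are then free.
doms = {"x1": [0, 1], "x2": [0, 1], "x3": [0, 1]}
cons = [(("x1", "x3"), {(1, 0), (1, 1)})]
solutions, nodes = cbj_all(doms, cons, ["x1", "x2", "x3"])
```

In this example the subtree under `x1 = 0` is left by a backjump from `x3` to `x1`, while the subtree under `x1 = 1`, which contains solutions, is explored chronologically.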

Lemma 3 Given a CSP and a variable ordering for CBJ to find all solutions, there exists 
a variable ordering for BT such that BT never visits more nodes than CBJ to find all 
solutions. 

Proof Let the first solution found by CBJ be $\{x_1 \leftarrow a_1, \ldots, x_n \leftarrow a_n\}$ in the
order of $x_1, \ldots, x_n$. We first construct the variable ordering for BT as it is applied to find
the first solution. However, because BT follows a strict chronological backtracking, it will inevitably
visit all the nodes $\{x_1 \leftarrow a_1, \ldots, x_{j-1} \leftarrow a_{j-1}, x_j \leftarrow a'_j\}$, where
$1 \le j \le n$ and $a'_j$ comes after $a_j$ in the domain of $x_j$. If CBJ skips any of these nodes,
for example, by a backjump from a deeper level variable $x_h$ to a variable above level $j$ while the
instantiations of $x_1, \ldots, x_j$ have not been changed, BT
will possibly visit more nodes than CBJ. We will show this cannot happen by induction
on the distance between the current level $j$ and the deepest level $n$. After CBJ has found
the solution at level $n$, it will try other values for $x_n$ and eventually backtrack to $x_{n-1}$. So
the nodes at level $n$ cannot be skipped. Suppose it is true for the case of level $j + 1$ and
now we consider the case of level $j$. Because $x_j \leftarrow a_j$ was not skipped in the backjumping,
if $a_j$ is the last value in its domain, CBJ will backtrack to $x_{j-1}$ because the number of
solutions has been changed. So it is true for the case of $j$. Otherwise CBJ will change
the instantiation of $x_j$ to the next value in its domain. Let the current partial solution be
$t = \{x_1 \leftarrow a_1, \ldots, x_{j-1} \leftarrow a_{j-1}, x_j \leftarrow a'_j\}$. If the subtree rooted by $t$
contains solutions, from the inductive hypothesis, CBJ will not skip this node because it is on level $j$. If
the subtree rooted by $t$ contains no solution, there exists a backjump from a deeper level
variable $x_h$ to escape this subtree. Could it jump beyond $x_j$ such that $t$ is skipped? In that
case, the conflict set of $x_h$ is subsumed in $\{x_1, \ldots, x_{j-1}\}$. From the definition of conflict
set, we know that the current instantiations of the variables in the conflict set cannot lead
to a solution. However the current instantiations of $\{x_1, \ldots, x_{j-1}\}$ do lead to a solution,
$\{x_1 \leftarrow a_1, \ldots, x_n \leftarrow a_n\}$. That is a contradiction. So the conflict set of $x_h$ must contain
$x_j$ and thus the node $t$ at level $j$ cannot be skipped. After all the values in the domain
of $x_j$ have been tried, CBJ will chronologically backtrack to $x_{j-1}$ because the number of
solutions has changed. Thus, $x_{j-1} \leftarrow a_{j-1}$ will not be skipped. The hypothesis is true for
the case of any level $j$. Then we construct the variable ordering for BT in the following way:
If the current partial solution
$t = \{x_1 \leftarrow a_1, \ldots, x_{j-1} \leftarrow a_{j-1}, x_j \leftarrow a'_j\}$ cannot be extended
to a solution, we construct a variable ordering for the insoluble subproblem. If $t$ can be
extended to a solution, we construct a variable ordering for BT as in the case of finding the
first solution in this subproblem, and recursively apply the above steps until a backjump
to level $x_j$ changes the instantiation $x_j \leftarrow a'_j$. Under the above variable ordering, BT will
never visit more nodes than CBJ. □

Theorem 4 Given a CSP and a variable ordering for CBJ, there exists a variable ordering 
for BT such that BT never visits more nodes than CBJ in solving the CSP. 
Proof Follows from Lemmas 1, 2, and 3. □

Figure 1: An illustration of the variable ordering constructed for BT from a CBJ backtrack
tree (for the CSP shown upper left).

Example 1 Figure 1 shows the BT backtrack tree based on the variable ordering constructed
from the execution of CBJ to solve a CSP under a (hypothetical) dynamic variable ordering.
The first solution found by CBJ is
$\{x_1 \leftarrow 0, x_2 \leftarrow 0, x_3 \leftarrow 2, x_5 \leftarrow 0, x_4 \leftarrow 0\}$. Thus, BT
first instantiates $x_1$ and $x_2$ to 0. The nodes $\{x_1 \leftarrow 0, x_2 \leftarrow 0, x_3 \leftarrow 0\}$ and
$\{x_1 \leftarrow 0, x_2 \leftarrow 0, x_3 \leftarrow 1\}$ in the CBJ backtrack tree lead to insoluble
subproblems. The variable ordering
for BT at each of these nodes is constructed as in the case of insoluble CSPs. For example,
in the CBJ backtrack tree, the last backjump to revoke the node
$\{x_1 \leftarrow 0, x_2 \leftarrow 0, x_3 \leftarrow 0\}$
is from $x_5$ to $x_3$, so the next variable instantiated in BT at this node is $x_5$. Under such an
ordering, BT avoids instantiating $x_4$ and visits fewer nodes than CBJ. Then BT instantiates
$x_3$ to 2, $x_5$ to 0, and $x_4$ to 0, and finds the first solution.

We have shown that there exists a "perfect" variable ordering such that CBJ becomes 
redundant. Of course, the "perfect" ordering would not be known a priori, and in practice, 
the primary goal in designing variable ordering heuristics is not to simulate the execution of 
CBJ, but to reduce the size of the overall backtrack tree. As an example, the popular fail- 
first heuristic selects as the next variable to be instantiated the variable with the minimal 
remaining domain size (the size of the domain after removing values that are in conflict 
with past instantiations) as this can be shown to minimize the size of the overall tree under 
certain assumptions. A secondary effect, however, is that variables that have conflicts with 
past instantiations are likely to be instantiated sooner, thus approximating the "perfect" 
ordering and diminishing the effects of backjumping. 
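The fail-first rule can be sketched as a variable selection function. The helper names and the example CSP below are assumptions for illustration; the remaining domain of a variable is computed by discarding the values that conflict with the past instantiations through a fully instantiated constraint.

```python
def remaining_values(x, t, domains, constraints):
    """Values of x that are not in conflict with the past instantiations in t."""
    def ok(a):
        s = {**t, x: a}
        return all(
            tuple(s[y] for y in scope) in rel
            for scope, rel in constraints
            if x in scope and all(y in s for y in scope)
        )
    return [a for a in domains[x] if ok(a)]


def fail_first(t, domains, constraints):
    """Select the uninstantiated variable with the smallest remaining domain."""
    future = [x for x in domains if x not in t]
    return min(future, key=lambda x: len(remaining_values(x, t, domains, constraints)))


# Hypothetical example: x1 < x3 and x2 < x3 over {0, 1, 2}.
less_than = {(a, b) for a in range(3) for b in range(3) if a < b}
doms = {"x1": [0, 1, 2], "x2": [0, 1, 2], "x3": [0, 1, 2]}
cons = [(("x1", "x3"), less_than), (("x2", "x3"), less_than)]
chosen = fail_first({"x1": 1}, doms, cons)
```

After `x1` is set to 1, only the value 2 remains for `x3`, so the heuristic selects `x3` ahead of `x2`: the variable that conflicts with the past instantiation is brought forward, which is the secondary effect described above.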

4. Maintaining Consistency and Backjumping 

In this section, we present theoretical results that deepen our understanding of the relation- 
ship between the look-ahead technique of maintaining a level of local consistency during 
the backtracking search and the look-back technique of CBJ. 

In previous work, Kondrak and van Beek (1997) show that, given the same deterministic 
static or dynamic variable ordering heuristic, CBJ never visits more nodes than BT and 
FC-CBJ never visits more nodes than FC. Prosser (1993a) shows that the removal of an 
inconsistent value from the domain of a variable can diminish the effects of CBJ and that 
CBJ can visit fewer nodes than an algorithm that combines CBJ with the discovery and 
removal of some inconsistent values. Previous empirical work shows that the number of 
nodes that CBJ saves depends on the level of local consistency maintained (Bacchus & van 
Run, 1995; Bessiere & Regin, 1996; Prosser, 1993b). 

We extend the partial ordering of backtracking algorithms presented by Kondrak and 
van Beek (1997) to include backtracking algorithms and their CBJ hybrids that maintain 
levels of local consistency beyond forward checking, including the important algorithms that 
maintain arc consistency. We show that CBJ and an algorithm that maintains strong
$k$-consistency in the backtracking search are incomparable in that each can be exponentially
better than the other. This result is refined by using the concept of backjump level in
the execution of a backjumping algorithm and showing that an algorithm that maintains
strong $k$-consistency never visits more nodes than a backjumping algorithm that is allowed
to backjump at most $k$ levels. Thus, the higher the level of local consistency maintained in
the backtracking search, the smaller the improvement that backjumping can provide.

In Section 4.1, we consider the backjumping algorithms and define the series of
algorithms $\mathrm{BJ}_k$. In Section 4.2, we consider the look-ahead algorithms that maintain a level of
local consistency and define the series of algorithms $\mathrm{MC}_k$. Finally, in Section 4.3, we
consider the relationships between the backjumping and the look-ahead algorithms and their
hybrids. The reader who is not interested in the technical proofs of the results should jump
directly to Section 4.3.

[Figure 2: CBJ backtrack tree, not reproduced here. The CSP shown at the upper right of the figure is: 
x_1 + x_2 < x_3,  x_1 + x_3 > x_5 + 1,  x_2 - x_4 > x_5,  with x_1, ..., x_5 ∈ {0, 1, 2}.] 

Figure 2: An illustration of backjump levels in a CBJ backtrack tree (for the CSP shown 
upper right). 



4.1 Backjump Level and BJ_k 

To analyze the influence of the level of consistency on the backjumping, we need the notion of 
backjump level. Informally, the level of a backjump is the distance, measured in backjumps, 
from the backjump destination to the "farthest" dead-end. 

Definition 6 (backjump level, Kondrak & van Beek, 1997) The definition of backjump 
level is recursive: 

1. A backjump from variable x_i to variable x_h is of level 1 if it is performed directly from a 
dead-end state in which every value of x_i fails a consistency check. 

2. A backjump from variable x_i to variable x_h is of level d ≥ 2, if all backjumps performed 
to variable x_i are of level less than d, and at least one of them is of level d - 1. 

Example 2 Figure 2 shows the backjump levels in an example CBJ backtrack tree. There is 
a one-level backjump from x_5 to x_3 because every value in the domain of x_5 fails a consistency 
check. Then CBJ finds two solutions for the problem and thus it chronologically backtracks 
from x_4 to x_5, and later to x_3. The backjumps are of level one and two respectively. At last 
there is a three-level backjump from x_3 to x_2. 
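
The recursion in Definition 6 can be sketched in a few lines. This is only an illustration: the `Backjump` record, its field names, and the variable names `v1` to `v4` are hypothetical and do not come from the paper. A backjump with no incoming backjumps models a jump made directly from a dead-end; otherwise its level is one more than the highest level among the backjumps that arrived at its source variable.

```python
from dataclasses import dataclass, field


@dataclass
class Backjump:
    source: str   # variable the backjump starts from
    target: str   # variable the backjump lands on
    # backjumps that previously arrived at `source` (empty at a dead-end)
    incoming: list = field(default_factory=list)


def level(bj):
    """Backjump level: 1 from a dead-end, else 1 + highest incoming level."""
    if not bj.incoming:
        return 1
    return 1 + max(level(b) for b in bj.incoming)


# A chain shaped like the one described in Example 2 (hypothetical names).
a = Backjump("v4", "v2")            # directly from a dead-end
b = Backjump("v4", "v3")            # directly from a dead-end
c = Backjump("v3", "v2", [b])       # one incoming level-1 backjump
d = Backjump("v2", "v1", [a, c])    # incoming backjumps of level 1 and 2
print([level(x) for x in (a, b, c, d)])
```

Writing the level as `1 + max(...)` is equivalent to the condition in the definition that all incoming backjumps have level less than d and at least one has level d - 1.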

By classifying the backjumps performed by a backjumping algorithm into different levels, 
we can now weaken CBJ into a series of backjumping algorithms which perform limited 
levels of backjumps. BJ_k is a backjumping algorithm which is allowed to perform at most 
k-level backjumps and it chronologically backtracks when a j-level backjump for j > k is 
encountered (see footnote 3). BJ_n is equivalent to CBJ, which performs unlimited backjumps, and BJ_1 is 
equivalent to Gaschnig's (1978) BJ, which only does the first level backjumps or backjumps 
from dead-ends. 

3. BJ_k is only of theoretical interest since in practice one would use CBJ rather than artificially prevent 
backjumping; i.e., one has to actually add code to prevent backjumping. 

One may immediately conclude that BJ_{k+1} is always better than BJ_k because it does one 
more level of backjumps. However, to be more precise, we need to justify that a situation 
where BJ_k may skip a node visited by BJ_{k+1} does not exist. Similar to a result by Kondrak 
and van Beek (Theorem 11, 1997), we can show that: 

Theorem 5 BJ_k visits all the nodes that BJ_{k+1} visits. 

4.2 Maintaining Strong k-consistency (MC_k) 

Although backtracking algorithms that maintain arc consistency (or a truncated form of arc 
consistency called forward checking) during the search have been well-studied, a backtracking 
algorithm that maintains strong k-consistency (MC_k) has never been fully addressed in 
the literature. In order to study the relationship between BJ_k and MC_k, we need to specify 
precisely the MC_k algorithms. 

A generic scheme to maintain a level of local consistency in a backtracking search is to 
perform at each node in the search tree one full cycle of consistency achievement. A consis- 
tency achievement algorithm is applied to the CSP which is induced by the current partial 
solution. If, as a result, the induced CSP becomes empty after applying the consistency 
algorithm, the instantiation of the current variable is a dead-end and should be rejected. 
If the resulting CSP is not empty, the instantiation of the current variable is accepted and 
the search continues to the next level. 

The simplest form of an induced CSP is to restrict the domains of the instantiated 
variables to have only one value and leave the set of constraints unchanged. This idea can 
be traced back to Gaschnig's (1978) implementation of MAC, referred to as DEEB; i.e., 
Domain Element Elimination with Backtracking. However, in order to establish a relation 
between BJ_k and MC_k, we need a more restricted definition of the induced CSP, where the 
constraints in the induced CSP are the selections and projections of the constraints in the 
original CSP with respect to a partial solution. 

Definition 7 (induced CSP) Given a consistent partial solution t of a CSP P, the CSP 
induced by t, denoted by P|_t, has all the variables in P except those instantiated by t, 
the domain of each variable is the same as in P, and for each constraint C in P where 
vars(C) ⊄ vars(t), there is a constraint C' = π_{vars(C)-vars(t)}(σ_{t[vars(C)∩vars(t)]}(C)) in P|_t. 

Example 3 Consider the graph coloring problem and the corresponding CSP shown in 
Figure 3. The original CSP has four variables, x_1, ..., x_4, where x_1, x_2, x_3 ∈ {r, g, b} and 
x_4 ∈ {r}, and five binary constraints, x_1 ≠ x_2, x_1 ≠ x_3, x_2 ≠ x_3, x_2 ≠ x_4 and x_3 ≠ x_4. 
Given a partial solution t = {x_1 ← g, x_2 ← b}, the CSP induced by t, P|_t, has two variables, 
x_3 and x_4, and the unary and binary constraints shown in Figure 4. 
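
The selection and projection in Definition 7 can be made concrete on the graph coloring instance of Example 3. The following is a minimal sketch, not the authors' implementation: the representation of a constraint as a pair (scope, set of allowed tuples) and the names `neq` and `induced_csp` are assumptions made for illustration.

```python
from itertools import product


def neq(scope, domains):
    """Allowed tuples of a binary not-equal constraint over `scope`."""
    u, v = scope
    return {(a, b) for a, b in product(domains[u], domains[v]) if a != b}


def induced_csp(domains, constraints, t):
    """Return (domains, constraints) of the CSP induced by partial solution t."""
    new_domains = {x: d for x, d in domains.items() if x not in t}
    new_constraints = []
    for scope, allowed in constraints:
        rest = [i for i, x in enumerate(scope) if x not in t]
        if not rest:
            continue  # vars(C) is a subset of vars(t): nothing is induced
        # selection: keep the tuples that agree with t on the shared variables
        selected = [tup for tup in allowed
                    if all(tup[i] == t[x] for i, x in enumerate(scope) if x in t)]
        # projection: drop the positions of the instantiated variables
        projected = {tuple(tup[i] for i in rest) for tup in selected}
        new_constraints.append((tuple(scope[i] for i in rest), projected))
    return new_domains, new_constraints


domains = {"x1": "rgb", "x2": "rgb", "x3": "rgb", "x4": "r"}
scopes = [("x1", "x2"), ("x1", "x3"), ("x2", "x3"), ("x2", "x4"), ("x3", "x4")]
constraints = [(s, neq(s, domains)) for s in scopes]

doms, cons = induced_csp(domains, constraints, {"x1": "g", "x2": "b"})
for scope, allowed in cons:
    print(scope, sorted(allowed))
```

Under these assumptions the induced CSP keeps the full domains of x_3 and x_4 and carries two unary constraints on x_3, one unary constraint on x_4, and the unchanged binary constraint between x_3 and x_4.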

The maintaining strong k-consistency algorithm (MC_k) at each node in the backtrack 
tree applies a strong k-consistency achievement algorithm to the CSP induced by the 
current partial solution. Under such an architecture, FC can be viewed as maintaining 
one-consistency, and for binary CSPs, MAC can be viewed as maintaining strong 
two-consistency. 

An algorithm enforcing strong k-consistency on a CSP instance should detect and remove 
all those inconsistencies t = {x_1 ← a_1, ..., x_{j-1} ← a_{j-1}} where 1 ≤ j ≤ k and t is consistent 
but cannot be consistently extended to some j-th variable x_j. To remove an inconsistency, 
we make it inconsistent in the resulting CSP by removing values from domains, removing 
inconsistent tuples from existing constraints, or adding new constraints to the CSP. 

We use the concept of a k-proof-tree in characterizing the tuples that are removed by a 
strong k-consistency achievement algorithm. 

Definition 8 (k-proof-tree) A k-proof-tree for a partial solution t over at most k variables 
in a CSP is a tree in which each node is associated with a partial solution over at most 
k variables in the CSP, where (1) the root of the k-proof-tree is associated with t, and (2) 
each leaf node of the k-proof-tree is inconsistent in the CSP, and (3) each non-leaf node s 
of the k-proof-tree is consistent in the CSP, and the children of s at the next level are nodes 
s' ∪ {x ← a_1}, ..., s' ∪ {x ← a_l} such that s' ⊆ s, x ∉ vars(s), and dom(x) = {a_1, ..., a_l}. 

Example 4 Figure 3 shows a three-proof-tree (more than one is possible) for t = {x_1 ← g} 
in the given graph coloring problem. Each non-leaf node, including the root t, is consistent, 
and each leaf node is inconsistent in the CSP. Since we have constructed a three-proof- 
tree for the tuple t it cannot be part of a solution to the CSP and a strong 3-consistency 
achievement algorithm would remove it. 

In general, if a k-proof-tree for an inconsistency in a CSP can be constructed, an algorithm 
achieving strong k-consistency would deduce and remove the inconsistency. After 
applying a strong k-consistency achievement algorithm on the CSP, if all the children of 
a node in the k-proof-tree are inconsistent in the resulting CSP, that node is also inconsistent 
in the resulting CSP because one of its subtuples cannot be consistently extended 
to an additional variable. Because all the leaf nodes in the k-proof-tree are inconsistent in 
the original CSP, in a bottom-up manner the inconsistency of the root of the tree can be 
deduced and removed from the resulting CSP. As a special case, if a k-proof-tree for the 
empty inconsistency in a CSP can be constructed, the CSP will be empty after enforcing 
strong k-consistency since every way to extend a variable has been shown to lead to an 
inconsistency (and therefore, each value would be removed from the domain resulting in the 
empty domain). 

On the other hand, after a CSP has been made strongly k-consistent, if a 
partial solution t over at most k variables is inconsistent in the resulting CSP, a k-proof-tree 
for t in the original CSP can be constructed. If t is inconsistent in the original CSP, the 
k-proof-tree contains the single node t. Otherwise, t or a subtuple t' of t cannot be extended 
to an additional variable x; i.e., all the partial solutions t' ∪ {x ← a_1}, ..., t' ∪ {x ← a_l}, 
where dom(x) = {a_1, ..., a_l}, are inconsistent in the resulting CSP. Then we can construct 
the k-proof-tree recursively for each of those inconsistencies. As a special case, if a CSP 
is empty after enforcing strong k-consistency, a k-proof-tree for the empty inconsistency in 
the original CSP can be constructed. 

The following lemmas (Lemma 6 to Lemma 8) reveal some basic properties about induced 
CSPs and strong k-consistency enforcement on induced CSPs, which are used in the 
proofs of Theorem 10 and Theorem 14. 

Figure 3: A three-proof-tree for {x_1 ← g} in the graph coloring problem. All leaf nodes in 
the proof-tree are inconsistent in the CSP. 



Lemma 6 Given a CSP P and two partial solutions t and t' of P, if t ⊆ t', then P|_{t'} = 
(P|_t)|_{t'-t}. 

Proof Clearly P|_{t'} and (P|_t)|_{t'-t} have the same set of variables and the same set of domains. 
Because π_{vars(C)-vars(t')}(σ_{t'}(C)) = π_{vars(C)-vars(t')}(σ_{t'-t}(π_{vars(C)-vars(t)}(σ_t(C)))) for each constraint 
C in P, the same selection and projection are made in P|_{t'} and (P|_t)|_{t'-t}. Therefore, P|_{t'} 
and (P|_t)|_{t'-t} have the same set of constraints. □ 



Lemma 7 Given a CSP P and a consistent partial solution t of P, if (i) P is empty after 
achieving strong k-consistency, or (ii) there exists a variable x ∈ vars(t) such that the value 
t[x] is removed from the domain of x when achieving strong k-consistency on P, then P|_t 
is empty after achieving strong k-consistency. 

Proof We first show that, given a consistent partial solution t of a CSP P, and a k-proof-tree 
T for an inconsistency s in P, there is a corresponding well-defined k-proof-tree T_t for 
the inconsistency s' = s[vars(s) - vars(t)] in the induced CSP P|_t, provided s does not 
contain any assignments that are inconsistent with the assignments in t. T_t is constructed 
from T in three steps (see Figure 4 for an example): 

(Step 1) Remove from T all nodes and their descendants which contain assignments that 
are inconsistent with the assignments in t. 

(Step 2) Replace each remaining node t' in T with the node t'' = t'[vars(t') - vars(t)]; 
i.e., remove those variables which occur in t and thus do not occur in P|_t. If t' is not a 
leaf node in T, then by definition t' is consistent in P. It is possible that the corresponding 
node t'' in T_t is inconsistent in P|_t. Should this be the case, we make t'' into a leaf node by 
removing all of its descendants. If t' is a leaf node in T, then by definition t' is inconsistent 
in P; i.e., there exists a constraint C in P such that t' does not satisfy C. It must be 
the case that vars(C) ⊄ vars(t) (since vars(C) ⊆ vars(t) contradicts the fact that t' is 
inconsistent with C and t is consistent and therefore consistent with C, but t' and t agree on 
their assignments by Step 1). Hence, there is a corresponding constraint C' in P|_t which is 
the selection and projection of C in P. Now, it is easy to verify that the corresponding node 
t'' is also inconsistent with C' and is therefore a well-defined leaf node. 

(Step 3) Remove all subsumed nodes from T, where node t_2 is subsumed by node t_1 if t_2 
is a (necessarily only) child of t_1 and vars(t_2) ⊆ vars(t_1). All children of a subsumed 
node t_2 are made children of the parent of t_2. 

[Figure 4: proof-tree not reproduced here. The induced CSP P|_t shown in the figure has 
x_3 ∈ {r, g, b}, x_4 ∈ {r}, unary constraints C(x_3): {(r), (b)}, C(x_3): {(r), (g)}, 
C(x_4): {(r)}, and binary constraint C(x_3, x_4): x_3 ≠ x_4.] 

Figure 4: Proof-tree for the empty inconsistency in the CSP P|_t induced by t = {x_1 ← g, 
x_2 ← b} constructed from the proof-tree for {x_1 ← g} in the CSP P shown in 
Figure 3. 

Now, suppose P is empty after achieving strong k-consistency. Then there is a k-proof-tree 
for the empty inconsistency in P and we can construct a k-proof-tree for the empty 
inconsistency in P|_t. Therefore, P|_t is empty after achieving strong k-consistency. Suppose 
there exists a variable x ∈ vars(t), such that the value t[x] is removed from the domain of 
x when achieving strong k-consistency on P. Then there is a k-proof-tree for {x ← t[x]} in 
P and we can construct a k-proof-tree for the empty inconsistency in P|_t. Therefore P|_t is 
empty after achieving strong k-consistency. □ 

Lemma 8 Given a CSP P and an assignment {x ← a}, a ∈ dom(x), if the induced CSP 
P|_{x←a} is empty after achieving strong (k - 1)-consistency, then the value a is removed 
from the domain of x when achieving strong k-consistency on P. 

Proof Suppose P|_{x←a} is empty after achieving strong (k - 1)-consistency. Thus, there is 
a (k - 1)-proof-tree for the empty inconsistency in P|_{x←a}. We now convert the (k - 1)-proof-tree 
to a k-proof-tree for {x ← a} in P. Each node t in the original (k - 1)-proof-tree 
is replaced by t ∪ {x ← a}. Thus, the root of the tree becomes {x ← a}. Furthermore, 
if t is not a leaf node in the original (k - 1)-proof-tree; i.e., t is consistent in P|_{x←a}, it 
is easy to verify that t ∪ {x ← a} is consistent in P. If t is a leaf node in the original 
(k - 1)-proof-tree; i.e., t is inconsistent in P|_{x←a}, there is a constraint C' in P|_{x←a} such 
that t does not satisfy C'. Let C' be the selection and projection of the constraint C in P. 
Thus, t ∪ {x ← a} does not satisfy the constraint C in P and is therefore inconsistent in P. 
Hence, we have constructed a k-proof-tree for {x ← a} in P and thus a would be removed 
from the domain of x when achieving strong k-consistency on P. □ 

MC_k extends the current node if the CSP induced by the current partial solution is not 
empty after achieving strong k-consistency. The node is thus called a k-consistent node. 

Definition 9 (k-consistent node) A node t in the search tree is a k-consistent node if 
the CSP induced by t is not empty after enforcing strong k-consistency. A node which is 
not k-consistent is called k-inconsistent. 

Lemma 9 If node t is k-consistent, its ancestors are also k-consistent. 

Proof Let t' be one of t's ancestors. Because t' ⊆ t, from Lemma 6, P|_t = (P|_{t'})|_{t-t'}. Thus, 
P|_t is an induced subproblem of P|_{t'}. From Lemma 7, if P|_t is not empty after achieving 
strong k-consistency, P|_{t'} is not empty either after achieving strong k-consistency. Thus, t' 
is k-consistent. □ 

The following theorem applies to the case of finding all solutions. 

Theorem 10 If MC_k visits a node, then its parent is k-consistent. If a node is k-consistent, 
then MC_k visits the node. 

Proof The first part is true because MC_k would not branch on this node if its parent was 
found k-inconsistent. We prove the second part by induction on the depth of the search tree. 
The hypothesis is trivial for j = 1. Suppose it is true for j ≥ 1 and we have a k-consistent 
node t at level j + 1. Let the current variable be x. From Lemma 9, t's parent t' at level 
j is k-consistent. Thus, MC_k will visit t'. From Lemma 6, P|_t = (P|_{t'})|_{x←t[x]}. Because 
(P|_{t'})|_{x←t[x]} is not empty after achieving strong k-consistency, from Lemma 7, value t[x] 
will not be removed from the domain of x when achieving strong k-consistency in P|_{t'}. As 
a consequence, MC_k will visit t. □ 

A necessary and sufficient condition for MC_k to visit a node t is that t's parent is 
k-consistent and the value assigned to the current variable by t has not been removed from 
its domain when enforcing strong k-consistency on t's parent. 

Theorem 11 Given a CSP and a variable ordering, MC_k visits all the nodes that MC_{k+1} 
visits. 

Proof Follows from Theorem 10 and Lemma 7. □ 

4.3 Relationship Between BJ_k and MC_k 

Kondrak and van Beek (1997) have shown that for binary CSPs, BJ (BJ_1) visits all the 
nodes that FC (MC_1) visits, and FC-CBJ (MC_1-CBJ) and CBJ are incomparable. We 
extend their partial ordering of backtracking algorithms to include the relationship between 
MC_k, BJ_k, and MC_k-CBJ, 1 ≤ k ≤ n. All of our results are for the case of general CSPs; 
i.e., they are not restricted to binary CSPs. 

We begin by characterizing an important property of the CBJ algorithm. 

Lemma 12 If CBJ performs a one-level backjump from a deeper variable x_i to a shallower 
variable x_h, the node t_h at the level of x_h is one-inconsistent. 

Proof Let S_i be the conflict set of x_i used in the backjumping in which x_h is the deepest 
variable. We show that x_i will experience a domain wipe out when enforcing one-consistency 
on the induced CSP P|_{t_h[S_i]}. Each node t_i at the level of x_i is a leaf node; 
i.e., t_i is inconsistent in P. Suppose t_i does not satisfy constraint C where x_i ∈ vars(C) 
and vars(C) ⊆ S_i ∪ {x_i}. The selection of C in P|_{t_h[S_i]}, which constrains only one variable 
{x_i}, should prohibit value t_i[x_i] of x_i. Thus, x_i will experience a domain wipe out when 
enforcing one-consistency on P|_{t_h[S_i]}. Note that P|_{t_h} is an induced subproblem of P|_{t_h[S_i]}. 
From Lemma 7, P|_{t_h} is empty after enforcing one-consistency. Thus, t_h at the level of x_h 
is one-inconsistent. □ 

Lemma 13 If CBJ performs a k-level backjump from a deeper variable x_i to a shallower 
variable x_h, the current node t_h at the level of x_h is k-inconsistent. 

Proof Let S_i be the current conflict set of x_i in which x_h is the deepest variable. We show 
that if there is a k-level backjump from x_i to x_h, then P|_{t_h[S_i]} is empty after enforcing strong 
k-consistency and thus t_h is k-inconsistent. The proof is by induction on k. The case k = 1 is true 
from Lemma 12. Suppose the hypothesis is true for the case of k - 1 but it is not true for 
the case of k; i.e., there is a k-level backjump from x_i to x_h, but the induced CSP P|_{t_h[S_i]} 
is not empty after enforcing strong k-consistency. So there is at least one value a left in the 
domain of x_i after enforcing strong k-consistency on P|_{t_h[S_i]}. We know that the node t_i at 
the level of x_i instantiating x_i with a is either incompatible with t_h (i.e., it is a leaf node) 
or is l-level backjumped from some deeper variable x_j, for some 1 ≤ l < k (see Figure 5). 
However, t_i cannot be a leaf node as otherwise a would be removed from the domain of x_i 
when enforcing strong k-consistency. Let S_j be the conflict set of x_j. From the hypothesis, 
the induced CSP P|_{t_i[S_j]} is empty after achieving strong l-consistency. Because value a is 
not removed from the resulting CSP, from Lemma 8, the induced CSP P|_{t_h[S_i] ∪ {x_i ← a}} is not 
empty after achieving strong (k - 1)-consistency. Because t_i[S_j] ⊆ t_h[S_i] ∪ {x_i ← a}, the 
induced CSP P|_{t_i[S_j]} is not empty after achieving strong (k - 1)-consistency. That leads to 
a contradiction. Thus P|_{t_h[S_i]} is empty after achieving strong k-consistency and t_h at the 
level of x_h is k-inconsistent. □ 

Figure 5: A scenario in the CBJ backtrack tree used in the proof of Lemma 13. 

Theorem 14 Given a CSP and a variable ordering, BJ_k visits all the nodes that MC_k 
visits. 

Proof The proof is by induction on the level of the search tree: if MC_k visits a node at 
level j in the search tree, BJ_k visits the same node. The case j = 1 is trivial. Suppose that it is true 
for the case of j ≥ 1 and we have a node t visited by MC_k at level j + 1. We know both 
MC_k and BJ_k visit t's parent at level j. The only chance that t may be skipped by BJ_k is 
that BJ_k backjumps from some deeper variable x_i at level i to a shallower variable x_h at 
level h, such that h < j + 1 < i. Thus, the node at level h is k-inconsistent (by Lemma 13). 
Since the node at level h is an ancestor of t and we know t's parent is k-consistent (by 
Theorem 10), the node at level h is k-consistent by Lemma 9. This is a contradiction. Therefore, BJ_k 
visits t at level j + 1. □ 

MC_k can be combined with backjumping, namely MC_k-CBJ, provided the conflict sets 
are computed correctly after achieving strong k-consistency on the induced CSPs. 

Theorem 15 Given a CSP and a variable ordering, MC_k visits all the nodes that MC_k-CBJ 
visits. 

Proof Because MC_k-CBJ behaves exactly the same as MC_k in the forward phase of a 
backtracking search, it is easy to verify that MC_k-CBJ visits a node t only if t's parent 
is k-consistent and the value assigned to the current variable by t was not removed from 
its domain when achieving strong k-consistency on t's parent. Therefore, MC_k-CBJ never 
visits more nodes than MC_k does. □ 

In Figure 6, we present a hierarchy in terms of the size of the backtrack tree for BJ_k, 
MC_k, and MC_k-CBJ. If there is a path from algorithm A to algorithm B in the figure, 
we know that A never visits more nodes than B does. For example, MC_k never visits 
more nodes than BJ_j, for all j ≤ k. Otherwise, there are instances to show A may be 
exponentially better than B, and vice versa. 

Figure 6: A hierarchy for BJ_k, MC_k, and MC_k-CBJ in terms of the size of the backtrack 
tree. 



As the following example shows, for any fixed integer k < n, there exists a CSP instance 
such that CBJ visits exponentially fewer nodes than an algorithm that maintains strong 
k-consistency in the backtracking search (see footnote 4). 

Example 5 Given a fixed integer k, we can construct a binary CSP with n + k + 2 variables, 
x_1, ..., x_{n-k+1}, y_1, ..., y_{k+1}, x_{n-k+2}, ..., x_{n+1}, where dom(x_i) = {1, ..., n} for 1 ≤ i ≤ n + 1 
and dom(y_j) = {1, ..., k} for 1 ≤ j ≤ k + 1. The constraints are: x_i ≠ x_j, for i ≠ j, and 
y_i ≠ y_j, for i ≠ j. The problem consists of two separate pigeon-hole subproblems, one over 
variables x_1, ..., x_{n+1} and the other over variables y_1, ..., y_{k+1}, and is insoluble. As we can 
see, the pigeon-hole problem is highly locally consistent. The first subproblem is strongly 
n-consistent and the second is strongly k-consistent. Under the above static variable ordering, 
a backtracking algorithm maintaining strong k-consistency would not encounter a dead-end 
until x_{n-k+1} is instantiated. Then it would find that the subproblem of x_{n-k+2}, ..., x_{n+1} 
is not strongly k-consistent. Thus, the algorithm will backtrack before it reaches the second 
pigeon-hole subproblem. It will explore n!/(k-1)! nodes at level n - k + 1 of the search tree and 
thus take an exponential number of steps to find the problem is insoluble. CBJ does not 
encounter a dead-end at the level of x_{n-k+1} and it continues to the second pigeon-hole 
problem. Eventually it will find the second pigeon-hole problem is insoluble and backjump 
to the root of the search tree. The total number of nodes explored is bounded by a constant, 
O((k+1)^k), for a fixed k. Therefore, CBJ can be exponentially better than an algorithm 
maintaining strong k-consistency. 

4. Independently, Bacchus and Grove (1999) present a similar example to show that given a fixed k, CBJ 
may be exponentially better than an algorithm called MIkC, which essentially maintains k-consistency 
in the backtracking search. 
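
The contrast in Example 5 can be observed on small instances. The sketch below is an illustration under stated assumptions, not the algorithms used in the paper: it fixes k = 1 so that forward checking, which the text views as maintaining one-consistency, can stand in for MC_k. The function names, the convention that every attempted instantiation counts as one visited node, and the rule of recording the earliest conflicting past variable in the conflict set are all choices made here.

```python
def two_pigeonhole(n, k):
    """Variables of Example 5 in its static order; every constraint is !=."""
    xs = [("x", i) for i in range(1, n + 2)]
    ys = [("y", j) for j in range(1, k + 2)]
    order = xs[: n - k + 1] + ys + xs[n - k + 1:]
    dom = {v: list(range(1, (n if v[0] == "x" else k) + 1)) for v in order}
    return order, dom


def conflicts(u, a, v, b):
    # two distinct variables of the same family must take different values
    return u[0] == v[0] and a == b


def cbj_nodes(order, dom):
    """Nodes visited by conflict-directed backjumping (first solution or failure)."""
    visited, assign = 0, {}

    def search(level):  # returns None on success, else a conflict set of levels
        nonlocal visited
        if level == len(order):
            return None
        var, conflict = order[level], set()
        for val in dom[var]:
            visited += 1
            culprit = next((h for h in range(level)
                            if conflicts(order[h], assign[order[h]], var, val)), None)
            if culprit is not None:
                conflict.add(culprit)
                continue
            assign[var] = val
            below = search(level + 1)
            del assign[var]
            if below is None:
                return None
            if level not in below:
                return below  # this level is not to blame: jump over it
            conflict |= below - {level}
        return conflict

    search(0)
    return visited


def fc_nodes(order, dom):
    """Nodes visited by forward checking with chronological backtracking."""
    visited = 0

    def search(level, live):
        nonlocal visited
        if level == len(order):
            return True
        var = order[level]
        for val in live[var]:
            visited += 1
            pruned, wiped = dict(live), False
            for fut in order[level + 1:]:
                pruned[fut] = [b for b in live[fut] if not conflicts(var, val, fut, b)]
                if not pruned[fut]:
                    wiped = True
                    break
            if not wiped and search(level + 1, pruned):
                return True
        return False

    search(0, {v: list(dom[v]) for v in order})
    return visited


for n in (4, 5, 6):
    order, dom = two_pigeonhole(n, 1)
    print(n, "FC:", fc_nodes(order, dom), "CBJ:", cbj_nodes(order, dom))
```

With this counting convention the forward-checking total grows factorially with n, while CBJ only pays for the initial descent through the x variables plus a small constant for refuting the y subproblem.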

Example 5 also shows that, although MC_k never visits more nodes than BJ_k by Theorem 14, 
BJ_{k+1} can be exponentially better than MC_k. However, BJ_{k+1} can be better than MC_k 
only if there is a (k + 1)-level backjump that is not also a chronological backtrack. To see 
that this is true, suppose that on a particular instance all (k + 1)-level backjumps are also 
chronological backtracks (i.e., the backjump is to the immediately preceding variable in the 
variable ordering and only that single variable becomes uninstantiated and is removed from 
the current partial solution). In this case, the freedom to backjump one additional level 
rather than chronologically backtrack does not make a difference and BJ_{k+1} is effectively 
BJ_k and thus cannot be better than MC_k. Thus, BJ_{k+1} can be better than MC_k only 
if there is a (k + 1)-level non-chronological backjump. We note, however, that since the 
number of backjumps of level k + 1 is less than or equal to the number of backjumps of level 
k, as k increases this gets more and more unlikely. Thus, as the level of local consistency 
that is maintained in the backtracking search is increased, the less that backjumping will 
be an improvement. 

Consider Example 5 again. At each level of the backtrack tree for MC_k, the instantiation 
of each of the past variables removes one distinct value from the domain of the current 
variable (recall that MC_k never instantiates the variable y_1 as it reaches a dead-end at 
x_{n-k+1}). If we were to maintain conflict sets for the variables, the conflict set for the current 
variable would include all of its past variables and thus when a dead-end is encountered 
by the algorithm, any backjump computed from the conflict sets would also necessarily be 
a chronological backtrack. Thus, as this example shows, MC_k-CBJ and MC_k can visit 
exactly the same nodes and consequently BJ_{k+1} can be exponentially better than MC_k-CBJ. 
Furthermore, because MC_{k-1}-CBJ can reach the second pigeon-hole problem without 
encountering a dead-end, it can finally retreat from the second pigeon-hole problem to the 
root of the search tree by backjumps. Thus, MC_{k-1}-CBJ may be exponentially better 
than MC_k-CBJ. In particular, this shows the surprising result that MAC-CBJ can visit 
exponentially more nodes than FC-CBJ. 

Finally, as the following example shows, for any fixed integer k < n, there exists a CSP 
instance such that an algorithm that maintains strong k-consistency in the backtracking 
search visits exponentially fewer nodes than CBJ. 

Example 6 Consider the CSP as defined in Example 5, but searched with the static variable 
ordering y_1, ..., y_k, x_1, ..., x_{n+1}, y_{k+1}. 

5. Empirical Evaluation of Adding CBJ to GAC 

In this section, we report on experiments that examined the effect of adding CBJ to a 
backtracking algorithm that maintains generalized arc consistency (GAC), an algorithm 
that we refer to as GAC-CBJ. Previous work has shown the importance of algorithms that 
maintain arc consistency (e.g., Sabin & Freuder, 1994; Bessiere & Regin, 1996). We show 
that adding CBJ to a backtracking algorithm that maintains generalized arc consistency 
can speed up the algorithm by several orders of magnitude on hard, structured problems. 

Previous empirical studies of adding CBJ to a backtracking algorithm that maintains a 
level of local consistency have led to mixed conclusions. Adding CBJ to forward checking, 
a truncated form of arc consistency, has been shown to give improvements but not always 
significant ones. Prosser (1993b) observes that with a static variable ordering, FC-CBJ is 
about three times faster than FC on the Zebra problem. Smith and Grant (1995) observe 
that with a dynamic variable ordering, adding CBJ to FC led to significant savings but 
only on hard random problems that occur in the easy region. Bacchus and van Run (1995) 
observe that with a dynamic variable ordering, adding CBJ to FC only led to at most a 
5% improvement on the Zebra problem, n-Queens problems, and random binary problems. 
Bayardo and Schrag (1996, 1997) show that adding CBJ to the well-known Davis-Putnam 
algorithm, the SAT version of forward checking, can be a significant improvement on hard 
random and real-world 3-SAT problems. 

Adding CBJ to an algorithm that maintains full arc consistency has received less at- 
tention in the literature. In the one study that we are aware of, Bessiere and Regin (1996) 
observe that adding CBJ to MAC (the binary version of GAC) actually slows down the 
algorithm on random binary problems due to the overhead of maintaining the conflict sets. 
They conjecture that "when MAC and a good variable ordering heuristic are used, CBJ 
becomes useless". 

Our empirical results lead us to differ with Bessiere and Regin's conclusion about the 
usefulness of adding CBJ to an algorithm that maintains full arc consistency. In our imple- 
mentation we were able to significantly reduce the overhead of maintaining the conflict sets 
through the use of additional data structures (see footnote 5). On problems where adding CBJ does not 
lead to many savings in nodes visited, our implementation of CBJ also does not degrade per- 
formance by any significant factor. We demonstrate the improvement by re-doing Bessiere 
and Regin's (1996) experiments on random binary problems. We then show through exper- 
iments in two structured domains that GAC-CBJ can sometimes improve GAC by several 
orders of magnitude on hard instances. 

In our experiments, we ran both GAC and GAC-CBJ on each instance of a problem 
and recorded the CPU times. Comparing CPU times is appropriate as the underlying code 
for GAC and GAC-CBJ is identical, with GAC-CBJ containing only additional code to 
maintain the conflict sets and to determine how far to jump back. Two dynamic variable 
orderings were used: the popular dom+deg heuristic, which chooses the next variable with 
the minimal domain size and breaks ties by choosing the variable with the maximum degree 
(the number of constraints that constrain that variable), and the dom/deg heuristic 
proposed by Bessiere and Regin (1996), which chooses the next variable with the minimal 
value of the domain size divided by its degree. All experiments were run on 400 MHz 
Pentium IIs with 256 megabytes of memory. 

5. See the online appendix for the source code and a description of the key data structures in our 
implementations of GAC and GAC-CBJ. 
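
The two variable ordering heuristics can be sketched as follows. This is a minimal illustration, not the code from the online appendix: the function names, the representation of constraints as a list of scopes, and the treatment of a variable with degree zero (ranked last by dom/deg) are assumptions made here.

```python
def degree(var, scopes):
    """Number of constraints whose scope contains `var`."""
    return sum(1 for scope in scopes if var in scope)


def dom_plus_deg(unassigned, live, scopes):
    """Smallest current domain first; ties broken by largest degree."""
    return min(unassigned, key=lambda v: (len(live[v]), -degree(v, scopes)))


def dom_over_deg(unassigned, live, scopes):
    """Smallest ratio of current domain size to degree."""
    def ratio(v):
        deg = degree(v, scopes)
        return len(live[v]) / deg if deg else float("inf")
    return min(unassigned, key=ratio)


# Hypothetical state: d and e are already assigned, a, b and c are not.
live = {"a": [1, 2], "b": [1, 2], "c": [1, 2, 3]}
scopes = [("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("c", "e")]
print(dom_plus_deg(["a", "b", "c"], live, scopes))
print(dom_over_deg(["a", "b", "c"], live, scopes))
```

In this small state the two heuristics disagree: dom+deg prefers b (domain size 2, degree 2), whereas dom/deg prefers c, whose larger domain is outweighed by its degree of 4.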

5.1 Random Problems 

The run time performance of GAC and GAC-CBJ was compared on sets of randomly 
generated binary CSPs. A set of random problems is defined by a 5-tuple (n,d,r,m,t), 
where n is the number of the variables, d is the uniform domain size, r is the uniform arity 
of the constraints, m is the number of randomly generated constraints, and t is the uniform 
tightness or number of tuples in each constraint. In each case, the constraint tightness t 
was chosen so that approximately half of the instances in the population were insoluble; 
i.e., the instances were from the phase transition region. 
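
A generator for the 5-tuple model above might look like the following sketch. It is not the generator used in the experiments. It assumes that t counts the allowed tuples of each constraint, that the m constraint scopes are distinct and drawn uniformly, and that variables and values are numbered from zero; the name `random_csp` and the `seed` parameter are hypothetical.

```python
import random
from itertools import combinations, product


def random_csp(n, d, r, m, t, seed=None):
    """Random CSP with n variables, domain size d, m constraints of arity r,
    and t allowed tuples per constraint (an assumption about the meaning of t)."""
    rng = random.Random(seed)
    domains = {v: list(range(d)) for v in range(n)}
    # m distinct scopes chosen uniformly among all r-subsets of the variables
    scopes = rng.sample(list(combinations(range(n), r)), m)
    all_tuples = list(product(range(d), repeat=r))
    constraints = [(scope, set(rng.sample(all_tuples, t))) for scope in scopes]
    return domains, constraints


domains, constraints = random_csp(n=10, d=3, r=2, m=12, t=5, seed=0)
print(len(domains), len(constraints))
```

Enumerating every possible scope is adequate for binary constraints on a few hundred variables, as in the experiments reported here, but would need to be replaced by direct sampling for large arities.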

Table 1: Effect of domain size on average time (seconds) to solve random instances from 
(50, d, 2, 95, t). Each set contained 100 random instances. Both GAC-CBJ and 
GAC used the dom/deg variable ordering. 



   d    GAC-CBJ       GAC    ratio 
   5     0.0027    0.0030     0.90 
  10     0.026     0.027      0.96 
  15     0.10      0.10       1.00 
  20     0.41      0.41       1.00 
  25     0.79      0.78       1.01 
  30     2.46      2.47       1.00 
  35     3.82      3.80       1.01 
  40    10.98     10.75       1.02 

Bessiere and Regin (1996) examine the effect of domain size on the average time to 
solve random instances from (50, d, 2, 95, t) (see Figure 5 (right) in Bessiere & Regin, 1996). 
With their implementation of CBJ, adding CBJ steadily worsens performance as domain 
size increases until at d = 40 MAC-CBJ is about 1.7 times slower than MAC alone. With our 
implementation, the difference in performance between GAC-CBJ and GAC was negligible 
on these problems (see Table 1). 

The remaining sets of random problems that Bessiere and Regin used in their experi- 
ments to compare the performance of MAC-CBJ and MAC are now too simple to provide 
a meaningful comparison as they can be solved in less than 0.01 seconds on a 400 MHz 
Pentium II computer. Thus, we chose harder sets of random binary problems. On each 
instance we ran both GAC and GAC-CBJ and recorded the CPU times. Here we report 
the average ratio of the CPU times (GAC over GAC-CBJ). Each set contained 100 random 
instances. On the first set of problems, (150, 5, 2, 750, 19), the average ratio for the dom+deg 
variable ordering was 0.90 and the average ratio for the dom/deg variable ordering was 0.88. 
On the second set of problems, (150, 5, 2, 1500, 21), the average ratio for both the dom+deg
and dom/deg variable orderings was 0.93. In other words, on average GAC was roughly
10% faster than GAC-CBJ on these problems.

5.2 Planning Problems 

Planning, where one is required to find a sequence of actions from an initial state to a goal 
state, can be formulated as a CSP. In the formulation we used in our experiments, each 
state is modeled by a collection of variables and the constraints enforce the assignments of 
variables to represent a consistent state or a valid transition between states. (See Kautz & 
Selman, 1992; van Beek & Chen, 1999 for more details on the formulation of planning as a 
CSP.) 

Table 2: Time (seconds) to solve instances of the grid planning problem. A dash indicates
that the problem was not solved within 72000 seconds (20 hours) of CPU time.

                    dom+deg                 dom/deg
    instance       GAC    GAC-CBJ         GAC    GAC-CBJ
       1          0.66       0.68        1.58       0.86
       2        762.47      33.33     3965.10     321.17
       3             -          -           -          -
       4             -    1753.13           -          -
       5             -          -           -          -

In the experiments we used all 130 instances used in the First AI Planning Systems 
Competition, June 6-9, 1998. The instances come from five different domains: gripper, 
mystery, mprime, logistics, and grid. In the experiments we report, both GAC and GAC-CBJ
were based on AC3 (Mackworth, 1977a) as this was found to give the best performance.

For the gripper, mystery, and mprime domains, each of the instances could be solved 
in under 25 seconds by both GAC and GAC-CBJ. On these easy problems, the increased 
overhead of CBJ rarely led to savings, and overall GAC was 10-15% faster than GAC-CBJ. 

Table 2 shows the comparison between GAC and GAC-CBJ in solving the 5 instances 
of the grid problems. GAC-CBJ showed improvement on the grid problems. For example, 
it solved problem 4 in about half an hour, but GAC failed to find a solution in 20 hours. 

Table 3 shows the comparison between GAC and GAC-CBJ in solving the 30 instances 
of the logistics problem. On about one third of the instances, GAC-CBJ improved on GAC. 
For example, on instances 18, 20 and 27, GAC-CBJ ran several orders of magnitude faster 
than GAC, and on instance 15, GAC exhausted the 20 hours time limit but GAC-CBJ found 
a solution within 3 minutes. GAC-CBJ and GAC performed similarly on easier instances 
and sometimes GAC-CBJ was about 10% slower than GAC. 
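
When one of the two algorithms hits the time limit, only a lower bound on the speedup is available. The sketch below makes that explicit for a few logistics instances, using the dom+deg times from Table 3. The function name `speedup` and the use of `None` for a run that timed out are assumptions made for illustration.

```python
import math

TIME_LIMIT = 72000.0  # seconds (20 hours), the cutoff stated for Tables 2 and 3


def speedup(gac, gac_cbj, limit=TIME_LIMIT):
    """Return (GAC time / GAC-CBJ time, is_lower_bound).

    None stands for a run that hit the time limit. If GAC timed out, the
    ratio is computed from the limit and is only a lower bound. If GAC-CBJ
    timed out, no speedup can be stated and None is returned.
    """
    if gac_cbj is None:
        return None
    if gac is None:
        return limit / gac_cbj, True
    return gac / gac_cbj, False


# dom+deg times (seconds) for selected logistics instances, from Table 3.
LOGISTICS = {
    15: (None, 182.51),
    17: (264.46, 0.32),
    18: (15382.82, 1165.54),
    20: (6268.16, 27.66),
    27: (12239.26, 47.06),
}

for instance, (gac, gac_cbj) in sorted(LOGISTICS.items()):
    value, is_bound = speedup(gac, gac_cbj)
    prefix = ">= " if is_bound else ""
    print(f"instance {instance}: {prefix}{value:.1f}x (log10 = {math.log10(value):.1f})")
```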






Chen & van Beek 



Table 3: Time (seconds) to solve instances of the logistics planning problem. A dash
indicates that the problem was not solved within 72000 seconds (20 hours) of
CPU time.

                    dom+deg                 dom/deg
    instance       GAC    GAC-CBJ         GAC    GAC-CBJ
       1          0.03       0.03        0.03       0.03
       2          0.03       0.05        0.03       0.06
       3         10.91       0.86        9.63       0.81
       4          0.16       0.17        0.14       0.18
       5          1.51       1.54        1.54       1.57
       6         36.49      16.86       35.77      16.76
       7          0.08       0.08        0.08       0.09
       8          0.15       0.15        0.14       0.16
       9          0.30       0.33        0.32       0.33
      10             -          -           -          -
      11          0.04       0.05        0.05       0.05
      12          0.11       0.13        0.11       0.11
      13          0.54       0.57        0.54       0.56
      14          U.Do       U.D4        U.D4       U.Do
      15             -     182.51           -    8540.58
      16         12.49       0.42       12.32       0.41
      17        264.46       0.32      261.33       0.32
      18      15382.82    1165.54    15157.71    1184.67
      19          1.29       1.37        1.33       1.31
      20       6268.16      27.66     6125.87      28.55
      21          0.66       0.70        0.68       0.74
      22             -          -           -          -
      23             -          -           -          -
      24          0.08       0.09        0.08       0.09
      25         34.03      13.03       11.58      12.10
      26             -          -           -          -
      27      12239.26      47.06    12105.62      47.76
      28             -          -           -          -
      29             -          -           -          -
      30             -          -           -          -

5.3 Crossword Puzzle Problems

Crossword puzzle generation, where one is required to fill in a grid with words from a 
dictionary, can be formulated as a CSP. In the formulation we used in our experiments, each 
of the unknown words is represented by a variable which takes values from the dictionary. 
Binary constraints enforce that intersecting words agree on their intersecting letter and 
that a word from the dictionary appears at most once in a solution. Figure 7 shows an 
example 5x5 crossword puzzle grid. A CSP model of this grid has 10 variables, 21 binary 
"intersection" constraints, and 13 "not equals" constraints. 



      1   2   3   #   #
      4   5   6   7   8
      9  10  11  12  13
     14  15  16  17  18
      #   #  19  20  21

Figure 7: A crossword puzzle. Open squares are numbered 1 to 21; # marks a blocked square.
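
The counts quoted for the CSP model can be checked with a short script. This is a minimal sketch, not the model-building code used in the experiments: the grid layout (blocked squares in the top-right and bottom-left corners) is an assumption read off the numbering of Figure 7, the names `GRID`, `word_slots`, and `crossword_model` are hypothetical, and a "not equals" constraint is assumed to be posted between every pair of word slots of the same length.

```python
from itertools import combinations

# Assumed layout of the 5x5 grid: '.' is an open square, '#' is blocked.
GRID = [
    "...##",
    ".....",
    ".....",
    ".....",
    "##...",
]


def runs(cells, is_open):
    """Split a line of cells into maximal runs of open squares of length >= 2."""
    found, run = [], []
    for cell in cells:
        if is_open(cell):
            run.append(cell)
        else:
            if len(run) >= 2:
                found.append(run)
            run = []
    if len(run) >= 2:
        found.append(run)
    return found


def word_slots(grid):
    """Return every across and down word slot as a list of (row, col) cells."""
    n_rows, n_cols = len(grid), len(grid[0])
    is_open = lambda rc: grid[rc[0]][rc[1]] == "."
    slots = []
    for r in range(n_rows):  # across words
        slots += runs([(r, c) for c in range(n_cols)], is_open)
    for c in range(n_cols):  # down words
        slots += runs([(r, c) for r in range(n_rows)], is_open)
    return slots


def crossword_model(grid):
    """Return (slots, intersection constraints, not-equals constraints)."""
    slots = word_slots(grid)
    pairs = list(combinations(range(len(slots)), 2))
    # Two slots intersect when they share a square.
    intersections = [(i, j) for i, j in pairs if set(slots[i]) & set(slots[j])]
    # Only words of equal length can be identical, so only those need "not equals".
    not_equals = [(i, j) for i, j in pairs if len(slots[i]) == len(slots[j])]
    return slots, intersections, not_equals


slots, intersections, not_equals = crossword_model(GRID)
print(len(slots), len(intersections), len(not_equals))  # 10 21 13
```

Under these assumptions the script reproduces the 10 variables, 21 intersection constraints, and 13 "not equals" constraints stated in the text.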

In the experiments we used 50 grids and two dictionaries for a total of 100 instances of 
the problem that ranged from easy to very hard. For the grids, we used 10 instances at each 
of the following sizes: 5x5, 15x15, 19x19, 21x21, and 23x23. For the dictionaries we used 
the UK dictionary, which collects about 220,000 words and in which the largest domain for a 
word variable contains about 30,000 values, and the Linux dictionary, which collects 45,000 
words and in which the largest domain for a word variable has about 5,000 values. In the 
experiments we report, both GAC and GAC-CBJ were based on AC7 (Bessiere & Regin, 
1997) as this was found to give the best performance (see Sillito, 2000 for a discussion of 
integrating AC7 into backtracking search). 

Figure 8 shows approximate cumulative frequency curves for the empirical results, where 
we are plotting the ratio of the time taken to solve an instance by GAC over the time 
taken to solve the instance by GAC-CBJ. Thus, for example, we can read from the curve 
representing the dom+deg variable ordering that for approximately 85% of the tests adding 
CBJ had little effect and that for the remaining 15% of the tests it led to orders of magnitude 
improvements. We can also read from the curves the 0, 10, . . . , 100 percentiles of the data 
sets (where the value of the median is the 50th percentile or the value of the 50th test). The 
crossover point, where GAC-CBJ starts to perform as well as or better than GAC, occurs
around the 35th percentile. Tables 4 and 5 examine the data more closely by showing the
actual times to solve the instances where GAC performed best and the instances where
GAC-CBJ performed best.

Figure 8: Effect on execution time of GAC of adding conflict-directed backjumping (GAC-CBJ).
A curve represents 100 tests on instances of the crossword puzzle problem
where the tests are ordered by the ratio of time taken to solve the instance by
GAC over time taken to solve the instance by GAC-CBJ. [Plot omitted; the
horizontal axis is the test number, from 10 to 100.]
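
A curve of the kind shown in Figure 8 can be built by sorting the per-instance ratios and reading off positions in the sorted list. The sketch below illustrates this on five instances only, taken from the dom+deg columns of Tables 4 and 5 where both algorithms finished, so it does not reproduce the 35th-percentile crossover of the full data set. The function names are hypothetical.

```python
def ratio_curve(times):
    """Sort the per-instance ratios (GAC time over GAC-CBJ time) in ascending order."""
    return sorted(gac / gac_cbj for gac, gac_cbj in times)


def crossover_percentile(ratios):
    """Percentile (0 to 100) of the first test at which GAC-CBJ is at least as fast as GAC."""
    for index, value in enumerate(ratios):
        if value >= 1.0:
            return 100.0 * index / len(ratios)
    return None  # GAC was strictly faster on every test


# (GAC seconds, GAC-CBJ seconds), dom+deg ordering.
SAMPLE = [
    (1.21, 1.35),       # Table 4, rank 1
    (1.10, 1.20),       # Table 4, rank 2
    (6.12, 6.53),       # Table 4, rank 3
    (2716.86, 115.87),  # Table 5, rank 8
    (390.91, 34.90),    # Table 5, rank 9
]

curve = ratio_curve(SAMPLE)
print([round(value, 2) for value in curve])
print(crossover_percentile(curve))  # 60.0 for this five-instance sample
```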

Table 4: GAC versus GAC-CBJ on instances of the crossword puzzle problem. The ten
best improvements in time (seconds) of GAC over GAC-CBJ to solve an instance
are presented.

                    dom+deg                 dom/deg
    rank           GAC    GAC-CBJ         GAC    GAC-CBJ
       1          1.21       1.35        1.11       1.23
       2          1.10       1.20        0.95       1.02
       3          6.12       6.53        1.16       1.24
       4          0.78       0.81       56.66      60.36
       5        110.23     114.52        1.30       1.37
       6         68.67      71.28        4.86       5.11
       7         47.16      48.42        0.22       0.23
       8         32.69      33.63       14.23      14.76
       9         25.17      26.08       74.38      77.52
      10         20.73      21.37        7.43       7.67

Table 5: GAC versus GAC-CBJ on instances of the crossword puzzle problem. The ten
best improvements in time (seconds) of GAC-CBJ over GAC to solve an instance
are presented. A dash indicates that the problem was not solved
within 36000 seconds (10 hours) of CPU time.

                    dom+deg                 dom/deg
    rank           GAC    GAC-CBJ         GAC    GAC-CBJ
       1             -      37.85           -      54.60
       2             -      41.43    10311.32      33.43
       3             -      67.07           -     225.92
       4             -      82.58           -     244.81
       5             -     276.00           -     308.04
       6             -     542.80           -     374.72
       7             -     939.71           -     832.68
       8       2716.86     115.87           -    1486.43
       9        390.91      34.90           -    1890.24
      10             -    3336.37           -    3411.83

In summary, on some of the smaller, easier crossword puzzle instances GAC was slightly 
faster than GAC-CBJ, on many of the puzzles there was no noticeable difference, and on 
some of the larger, harder puzzles GAC-CBJ was orders of magnitude faster than GAC. 

6. Conclusion 

In this paper, we presented three main results. First, we showed that the choice of dynamic 
variable ordering heuristic can weaken the effects of the backjumping technique. Second, 
we showed that the higher the level of local consistency maintained in the backtracking
search, the less of an improvement backjumping provides. Together these
results partially explain why a backtracking algorithm doing more in the look-ahead phase 
cannot benefit more from the backjumping look-back scheme and they extend the partial 
ordering of backtracking algorithms presented by Kondrak and van Beek (1997) to include 
backtracking algorithms and their CBJ hybrids that maintain levels of local consistency 
beyond forward checking. Third, and finally, we showed that adding CBJ to a backtracking 
algorithm that maintains generalized arc consistency can (still) speed up the algorithm by 
several orders of magnitude on hard, structured problems. Throughout the paper, we did 
not restrict ourselves to binary CSPs.